Alloyed Global and Local Branch History: A Robust Solution to Wrong-History Mispredictions
Authors
Abstract
The need for accurate conditional-branch prediction is well known: mispredictions waste large numbers of cycles, inhibit out-of-order execution, and waste power on mis-speculated computation. Prior work on branch-predictor organization has focused mainly on how to reduce conflicts in the branch-predictor structures, while relatively little work has explored other causes of mispredictions. Some prior work has identified other categories of mispredictions, but this paper organizes these categories into a broad taxonomy of misprediction types. Using the taxonomy, this paper goes on to show that other categories, especially wrong-history mispredictions, are often more important than conflicts. This is true even if just a very simple conflict-reduction technique is used. Wrong-history mispredictions arise because current two-level, history-based predictors provide only global or only local history. Their contribution to the overall misprediction rate is substantial because most programs have some branches that require global history and also some that require local history. If only one or the other type of history is available, many branches are therefore penalized. For SPECint95 programs using a global-history predictor, wrong-history mispredictions account for 35–50% of the total misprediction rate. By comparison, conflicts account for only 15–20%. Hybrid predictors are one proposed solution; they use both a global-history and a local-history component. Unfortunately, hybrid predictors only work well with large hardware budgets, because they must subdivide the available area into subcomponents. Based on these observations, this paper proposes alloying local and global history together in a two-level branch predictor structure. This is a simple but superior technique for making global and local history simultaneously available and eliminating wrong-history mispredictions.
Unlike hybrid prediction, alloying gives robust performance for branch-predictor hardware budgets ranging from very large to very small. Alloying also consistently outperforms other two-level organizations. In fact, a small alloyed predictor often performs as well as a much larger global-history predictor.
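The idea of alloying can be illustrated with a toy simulation. The sketch below is not the paper's implementation: the abstract does not say how the index is formed, so concatenating PC bits, global-history bits, and local-history bits is an assumption, as are all names (`AlloyedPredictor`, `make_trace`, `mispredictions`), the table sizes, and the synthetic three-branch trace. Branch B copies the outcome of the random branch A (so it needs global history), while branch C repeats taken, taken, not-taken (so it needs its own local history).

```python
import collections
import random

PC_A, PC_B, PC_C = 0x10, 0x21, 0x32  # hypothetical branch addresses


class AlloyedPredictor:
    """Toy two-level predictor whose counter table is indexed by PC bits,
    global history, and local history together (assumed concatenation)."""

    def __init__(self, global_bits, local_bits, pc_bits=2, bht_entries=64):
        self.global_bits = global_bits
        self.local_bits = local_bits
        self.pc_bits = pc_bits
        self.bht_entries = bht_entries
        self.global_history = 0
        self.local_histories = [0] * bht_entries  # per-branch history table
        # 2-bit saturating counters, initialised to weakly not-taken.
        self.counters = [1] * (1 << (pc_bits + global_bits + local_bits))

    def _index(self, pc):
        local = self.local_histories[pc % self.bht_entries]
        pc_part = pc & ((1 << self.pc_bits) - 1)
        history = (self.global_history << self.local_bits) | local
        return (pc_part << (self.global_bits + self.local_bits)) | history

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)  # index uses the histories seen at predict time
        step = 1 if taken else -1
        self.counters[i] = max(0, min(3, self.counters[i] + step))
        bit = 1 if taken else 0
        slot = pc % self.bht_entries
        local_mask = (1 << self.local_bits) - 1
        global_mask = (1 << self.global_bits) - 1
        self.local_histories[slot] = ((self.local_histories[slot] << 1) | bit) & local_mask
        self.global_history = ((self.global_history << 1) | bit) & global_mask


def make_trace(iterations, seed=0):
    """Synthetic trace: A is random, B repeats A, C cycles T, T, N."""
    rng = random.Random(seed)
    trace = []
    for i in range(iterations):
        a = rng.getrandbits(1) == 1
        trace += [(PC_A, a), (PC_B, a), (PC_C, i % 3 != 2)]
    return trace


def mispredictions(predictor, trace):
    """Return a Counter mapping each branch PC to its misprediction count."""
    misses = collections.Counter()
    for pc, taken in trace:
        if predictor.predict(pc) != taken:
            misses[pc] += 1
        predictor.update(pc, taken)
    return misses


if __name__ == "__main__":
    trace = make_trace(2000)
    for name, g, l in [("global-only", 4, 0), ("local-only", 0, 4), ("alloyed", 2, 2)]:
        m = mispredictions(AlloyedPredictor(g, l), trace)
        print(f"{name:12s} B misses={m[PC_B]:4d}  C misses={m[PC_C]:4d}")
```

All three configurations use the same number of counters (64), so the comparison holds the pattern table size fixed. On this trace the global-only configuration struggles with C and the local-only configuration struggles with B, while the alloyed configuration handles both with only warm-up misses. That is the wrong-history effect the abstract describes, shown here on a contrived workload rather than on SPECint95.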
Related resources
Alloying Global and Local Branch History: Taxonomy, Performance, and Analysis
The need for accurate conditional-branch prediction is well known: mispredictions waste large numbers of cycles and also waste power on mis-speculated computation. A number of studies have explored ways to improve the prediction accuracy of two-level predictors, but have considered exclusively global or local history. Because most programs benefit from having both global and local history avail...
2FAR: A 2bcgskew Predictor Fused by an Alloyed Redundant History Skewed Perceptron Branch Predictor
This paper describes the 2bcgskew branch predictor fused by an alloyed redundant history skewed perceptron predictor, which is our design submitted to the 1st JILP Championship Branch Prediction (CBP) competition. The presented predictor intelligently combines multiple predictions (fusion) in order to obtain a more accurate prediction. The various predictions are delivered by a 2bcgskew predict...
Speculative Updates of Local and Global Branch History: A Quantitative Analysis
In today’s wide-issue processors, even small branch-misprediction rates introduce substantial performance penalties. Worse yet, inadequate branch prediction creates a bottleneck at the fetch stage, restricting other opportunities for improving performance. The choice of how to predict conditional-branch outcomes is the primary lever on prediction accuracy. But the choice of when to update the p...
Characterizing and Removing Branch Mispredictions
Control-flow mispredictions are a profound impediment to processor performance, because each misprediction introduces a pipeline bubble of many cycles’ duration. For example, the minimum bubble in the recently released Alpha 21264 is at least seven cycles, and often as much as twenty cycles. With such long penalties, even small misprediction rates harm performance substantially. Although a huge...
A Branch Predictor with New Recovery Mechanism
To improve the performance of wide-issue superscalar processors, it is essential to increase the instruction fetch and issue rate. Removal of control hazard has been put forward as a significant new source of instruction level parallelism for superscalar processors and the conditional branch prediction is an important technique for improving processor performance. Branch mispredictions waste a ...